Two-stage multi-target joint learning for monaural speech separation

نویسندگان

  • Shuai Nie
  • Shan Liang
  • Wei Xue
  • Xueliang Zhang
  • Wenju Liu
  • Like Dong
  • Hong Yang
چکیده

Recently, supervised speech separation has been extensively studied and shown considerable promise. Due to the temporal continuity of speech, speech auditory features and separation targets present prominent spectro-temporal structures and strong correlations over the time-frequency (T-F) domain, which can be exploited for speech separation. However, many supervised speech separation methods independently model each T-F unit with only one target and much ignore these useful information. In this paper, we propose a two-stage multi-target joint learning method to jointly model the related speech separation targets at the frame level. Systematic experiments show that the proposed approach consistently achieves better separation and generalization performances in the low signal-to-noise ratio(SNR) conditions.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Deep Ensemble Learning for Monaural Speech Separation

Monaural speech separation is a fundamental problem in robust speech processing. Recently, deep neural network (DNN) based speech separation methods, which predict either clean speech or an ideal time-frequency mask, have demonstrated remarkable performance improvement. However, a single DNN with a given window length does not leverage contextual information sufficiently, and the differences be...

متن کامل

Multi-Target Ensemble Learning for Monaural Speech Separation

Speech separation can be formulated as a supervised learning problem where a machine is trained to cast the acoustic features of the noisy speech to a time-frequency mask, or the spectrum of the clean speech. These two categories of speech separation methods can be generally referred as the masking-based and the mapping-based methods, but none of them can perfectly estimate the clean speech, si...

متن کامل

Monaural Multi-Talker Speech Recognition using Factorial Speech Processing Models

A Pascal challenge entitled monaural multi-talker speech recognition was developed, targeting the problem of robust automatic speech recognition against speech like noises which significantly degrades the performance of automatic speech recognition systems. In this challenge, two competing speakers say a simple command simultaneously and the objective is to recognize speech of the target speake...

متن کامل

Supervised Speech Separation Based on Deep Learning: An Overview

Speech separation is the task of separating target speech from background interference. Traditionally, speech separation is studied as a signal processing problem. A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data. Over the past decade, many supervised s...

متن کامل

Pitch-based monaural segregation of reverberant speech.

In everyday listening, both background noise and reverberation degrade the speech signal. Psychoacoustic evidence suggests that human speech perception under reverberant conditions relies mostly on monaural processing. While speech segregation based on periodicity has achieved considerable progress in handling additive noise, little research in monaural segregation has been devoted to reverbera...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015